Personalized Spam Filtering for Gray Mail

Authors

  • Ming-Wei Chang
  • Scott Yih
  • Robert McCann
Abstract

Gray mail, messages that could reasonably be considered either spam or good by different email users, is a commonly observed issue in production spam filtering systems. In this paper we study this class of mail using a large real-world email corpus and signature-based campaign detection techniques. Our analysis shows that even an optimal filter will inevitably perform unsatisfactorily on gray mail unless user preferences are taken into account. To overcome this difficulty we design a lightweight user model that is highly scalable and can be easily combined with a traditional global spam filter. Our approach is able to incorporate both partial and complete user feedback on message labels and catches up to 40% more spam from gray mail in the low false-positive region.
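
The idea of combining a global filter with a small per-user model can be illustrated with a minimal sketch. This is not the authors' implementation: the class `UserModel`, the function `personalized_score`, the campaign identifiers, the additive log-odds combination, and the `smoothing` and `weight` parameters are all assumptions made for illustration, and the sketch only covers explicit spam/good feedback on labels.

```python
import math
from collections import defaultdict


def logit(p: float) -> float:
    """Convert a probability (strictly between 0 and 1) into log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Convert log-odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


class UserModel:
    """Hypothetical per-user model: smoothed spam/good votes per campaign."""

    def __init__(self, smoothing: float = 1.0) -> None:
        self.smoothing = smoothing
        # campaign_id -> [spam_votes, good_votes] taken from this user's feedback
        self.counts = defaultdict(lambda: [0.0, 0.0])

    def add_feedback(self, campaign_id: str, is_spam: bool) -> None:
        """Record one explicit label from the user for a campaign."""
        self.counts[campaign_id][0 if is_spam else 1] += 1.0

    def preference(self, campaign_id: str) -> float:
        """Log-ratio of smoothed votes; 0.0 when the user gave no feedback."""
        spam, good = self.counts.get(campaign_id, (0.0, 0.0))
        return math.log((spam + self.smoothing) / (good + self.smoothing))


def personalized_score(global_prob: float, user: UserModel,
                       campaign_id: str, weight: float = 1.0) -> float:
    """Shift the global filter's spam probability by the user's preference.

    The additive combination in log-odds space is an assumption of this
    sketch, not a formula taken from the paper.
    """
    return sigmoid(logit(global_prob) + weight * user.preference(campaign_id))


# Two users disagree about the same (hypothetical) newsletter campaign.
alice, bob = UserModel(), UserModel()
alice.add_feedback("newsletter-42", is_spam=True)
alice.add_feedback("newsletter-42", is_spam=True)
bob.add_feedback("newsletter-42", is_spam=False)

alice_score = personalized_score(0.5, alice, "newsletter-42")
bob_score = personalized_score(0.5, bob, "newsletter-42")
```

With a global score of 0.5, the same message ends up above 0.5 for the user who reported the campaign as spam and below 0.5 for the user who marked it as good, while a user with no feedback keeps the global score unchanged. Because each user only stores a few counters, a model of this shape stays cheap to keep alongside a shared global filter.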

Similar resources

Personalized E-mail Filtering System Based on Usage Control

To cope with the rapid growth of spam, a personalized e-mail filtering method based on usage control (UCON) is proposed. E-mails from different senders are classified as junk, suspicious, or normal e-mail by a trusted third party, according to a maintained blacklist and embedded online machine learning technology. Suspicious e-mails will be classified further from users’ point of view ma...

Combining Global and Personal Anti-Spam Filtering

Many of the first successful applications of statistical learning to anti-spam filtering were personalized classifiers that were trained on an individual user’s spam and ham e-mail. Proponents of personalized filters argue that statistical text learning is effective because it can identify the unique aspects of each individual’s e-mail. On the other hand, a single classifier learned for a large...

SpamCooling: A Parallel Heterogeneous Ensemble Spam Filtering System Based on Active Learning Techniques

Anti-spam technology has developed rapidly in recent years. With the emerging applications of machine learning in diverse fields, researchers as well as manufacturers around the world have tried a large number of related algorithms to prevent spam. In this paper, we designed an effective anti-spam protection system, SpamCooling, based on the mechanism of active learning and parallel heterog...

An E-mail Authentication and Disposable Addressing Scheme for Filtering Spam

The number of spam mails has grown rapidly in recent years. Currently, the most common spam filtering solutions include blacklisting and content filtering, as well as the Bayesian approach, which uses a Bayesian filter to analyze mail content to generate classifiers. However, spammers can forge their addresses or include additional information that will mislead the filtering system or mark leg...

Towards Symbiotic Spam E-mail Filtering

This position paper discusses the use of symbiotic filtering, a novel distributed data mining approach that combines content-based and collaborative filtering for spam detection.

Publication date: 2008